CHAPTER 8 Getting Your Data into the Computer 103
articles, and because teachers love to include them on tests. The level of
measurement of a variable affects how, and with what precision, its data are
collected. Other level-of-measurement considerations include limiting the data
you collect to only what is needed, which also reduces data-privacy concerns and
cost. And, more practically, knowing the level of measurement of a variable can
help you choose the most appropriate way to analyze that variable.
Classifying and Recording Different Kinds of Data
Although you should be aware of the four levels of measurement described in the
preceding section, you also need to be able to classify and deal with data in a more
pragmatic way. The following sections describe various common types of data
you’re likely to encounter in the course of clinical and other research. We point
out some considerations you need to think through before you start collecting
your data.
Making bad decisions (or avoiding making decisions) about exactly how to
represent the data values in your research database can mess up the database,
and quite possibly doom the entire study to eventual failure. If you record the
values of your variables the wrong way, it may take an enormous amount of
additional effort to go back and fix them, and depending upon the error, a fix
may not even be possible!
Dealing with free-text data
It’s best to limit free-text variables, such as participant comments or write-in
fields for Other choices in a questionnaire, because they are difficult to box
into one of the four levels of measurement. Basically, you should only collect
free-text variables when you need to record verbatim what someone said or wrote.
Don’t use free-text fields as a lazy person’s substitute for what should be
precisely defined categorical data. Doing any meaningful statistical analysis of
free-text fields is generally very difficult, if not impossible.
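To see why free text is hard to analyze, consider a minimal Python sketch that tabulates the same six answers recorded two ways. The question, the answers, and the category names are all invented for illustration; they are assumptions, not data from any study.

```python
from collections import Counter

# Hypothetical write-in answers to a smoking question, recorded as free text.
free_text_answers = [
    "No", "no", "never", "Quit in 2010", "yes, a pack a day", "Yes",
]

# Hypothetical predefined categories the questionnaire could have offered.
CATEGORIES = ("Never", "Former", "Current")

# The same six answers as they might look if recorded with those categories.
coded_answers = ["Never", "Never", "Never", "Former", "Current", "Current"]


def tabulate(values):
    """Return a frequency table (value -> count) for the recorded values."""
    return Counter(values)


free_text_table = tabulate(free_text_answers)
coded_table = tabulate(coded_answers)

# Every free-text answer is spelled differently, so each gets its own row.
print(len(free_text_table), "distinct free-text values")

# The coded answers collapse into a small table that is ready to analyze.
print(dict(coded_table))
```

In this sketch the free-text version yields one row per participant, so someone would have to read and recode every answer by hand before any counting could begin. The coded version can be counted right away.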
You should also be aware that most software has field-length limitations for text
fields. Although commonly used software such as Microsoft Excel, SPSS, SAS, R,
and Python may allow for long data fields, this does not excuse you from
designing your study so as to limit collection of free-text variables. Flip to
Chapter 4 for an introduction to statistical software.
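If you do collect free text, one way to guard against field-length problems is to check the lengths before loading the data into another program. The following Python sketch flags overlong values. The 255-character limit, the function name find_overlong, and the sample comments are assumptions made for illustration; check your own software's documentation for its actual limit.

```python
# Assumed limit for illustration only; real limits vary by program and format.
MAX_FIELD_LENGTH = 255


def find_overlong(values, max_length=MAX_FIELD_LENGTH):
    """Return (index, length) pairs for text values longer than max_length."""
    return [
        (index, len(text))
        for index, text in enumerate(values)
        if len(text) > max_length
    ]


# Hypothetical participant comments; the second one is 300 characters long.
comments = ["Felt dizzy after dose 2.", "x" * 300, ""]

overlong = find_overlong(comments)
print(overlong)  # → [(1, 300)]
```

Flagging the values rather than silently truncating them lets you decide, case by case, whether to shorten a comment, store it elsewhere, or widen the field.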